 Biomedical Informatics


China races to build record biobank to rival U.S. drugs research

The Japan Times

Biobanks store masses of biomedical data such as clinical records, genome sequences and other long-term health metrics that research and drug development depend on. As a fledgling researcher in the U.S., Zhang Li was struck by the efficiency of extracting human tissue in the morning and mining it for data the same afternoon. Such a streamlined process had been missing from his years of training as a bio data scientist in China. Inspired, he returned home to Beijing to join the Chinese Institute for Brain Research and launch a national database that will collect blood and DNA samples from 33,000 children to help identify patterns of brain disease and their risk factors. "Biomedical data is extremely valuable and is fundamental for us to find solutions to diseases and to delay aging," said Zhang, surrounded by robotic arms carefully organizing blood samples.


Biconvex Biclustering

Rosen, Sam, Chi, Eric C., Xu, Jason

arXiv.org Machine Learning

This article proposes a biconvex modification to convex biclustering in order to improve its performance in high-dimensional settings. In contrast to heuristics that discard a subset of noisy features a priori, our method jointly learns and accordingly weighs informative features while discovering biclusters. Moreover, the method is adaptive to the data, and is accompanied by an efficient algorithm based on proximal alternating minimization, complete with detailed guidance on hyperparameter tuning and efficient solutions to optimization subproblems. These contributions are theoretically grounded; we establish finite-sample bounds on the objective function under sub-Gaussian errors, and generalize these guarantees to cases where input affinities need not be uniform. Extensive simulation results reveal our method consistently recovers underlying biclusters while weighing and selecting features appropriately, outperforming peer methods. An application to a gene microarray dataset of lymphoma samples recovers biclusters matching an underlying classification, while giving additional interpretation to the mRNA samples via the column groupings and fitted weights.


US's new scramble for Africa is biomedical imperialism

Al Jazeera

Late in February, Zimbabwe pulled out of a proposed $367m United States health funding agreement after objecting to provisions requiring broad American access to sensitive health data. The five-year programme was presented as support for HIV/AIDS, tuberculosis, malaria and epidemic preparedness efforts. However, the terms demanded extensive sharing of national health intelligence, including epidemiological surveillance data and pathogen samples, while offering no binding guarantees that Zimbabwe would receive equitable access to medical technologies developed from them. Harare called the proposal an "unequal exchange", warning that Zimbabwe risked supplying the "raw materials for scientific discovery" while the resulting benefits could remain concentrated in the United States and global pharmaceutical firms. Critics increasingly describe this pattern as biomedical extractivism: a toxic combination of exploitative research practices and colonial thinking that reinforces Western dominance.



GV-Rep: A Large-Scale Dataset for Genetic Variant Representation Learning

Neural Information Processing Systems

The development of deep learning approaches for modeling the multifactorial effects of genetic variants (GVs) is still in its nascent stages, primarily due to the lack of comprehensive datasets that capture the intricate relationships between GVs and their downstream effects on complex traits.





EHRSHOT: An EHR Benchmark for Few-Shot Evaluation of Foundation Models

Neural Information Processing Systems

We help address these challenges through three contributions. First, we publish a new dataset, EHRSHOT, which contains de-identified structured data from the electronic health records (EHRs) of 6,739 patients from Stanford Medicine.